18 research outputs found

    Using Temporal Subsumption for Developing Efficient Error-Detecting Distributed Algorithms

    Distributed algorithms can use executable assertions derived from program verification to detect errors at run-time. However, a complete verification proof outline contains a large number of assertions, and embedding all of them into the program to be checked at run-time would make error detection very inefficient. The technique of temporal subsumption examines the dependencies between the individual assertions along program execution paths. In contrast to classical subsumption, where all logical expressions to be examined are true simultaneously, an assertion need only be true when the corresponding statement in the distributed program has been executed. Thus, temporal subsumption, based on the set of assertions derived from a verification proof in combination with the set of all legal states in the system, allows for the removal of (partial) assertions along execution sequences. We assume a fault model of Byzantine (malicious) behavior; therefore, an individual process cannot check itself for faults. We assume that a non-faulty process will always perform the correct computation, so that once external data (obtained through communication) has been verified, the local computation does not need to be checked. A non-faulty process can thus detect faults produced by a faulty process based on the information it receives from it.

    Formal Generation of Executable Assertions for Application-Oriented Fault Tolerance

    Executable assertions embedded into a distributed computing system can provide run-time assurance by ensuring that the program state, in the actual run-time environment, is consistent with the logical state specified in the assertions. If it is not, an error has occurred, and this diagnostic information is reliably communicated to the system so that reconfiguration and recovery can take place. Application-oriented fault tolerance is a method that provides fault detection using executable assertions based on the natural constraints of the application. This paper focuses on giving application-oriented fault tolerance a theoretical foundation by providing a mathematical model for the generation of executable assertions which detect faults in the presence of arbitrary failures. The mathematical model of choice was axiomatic program verification. A method was developed that translates a concurrent verification proof outline into an error-detecting concurrent program. This paper shows the application of the developed method to several applications.
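As a rough illustration of the idea in this abstract, the sketch below shows a receiving process checking data from a peer against application constraints. It is only a minimal sketch: the `Message` fields, the two constraints, and the helper names `violated` and `make_assertions` are hypothetical and do not come from the paper, whose assertions are derived from a verification proof outline rather than written by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Message:
    sender: int   # id of the process that sent the data
    round: int    # iteration the sender claims to be in
    value: float  # application data carried by the message


# An executable assertion is modeled here as a named predicate over received data.
Assertion = Callable[[Message], bool]


def violated(msg: Message, assertions: Dict[str, Assertion]) -> List[str]:
    """Return the names of all assertions that the message fails."""
    return [name for name, holds in assertions.items() if not holds(msg)]


def make_assertions(expected_round: int, lo: float, hi: float) -> Dict[str, Assertion]:
    # Hypothetical "natural constraints": the sender must be in the expected
    # round, and the value must lie in the range the application allows.
    return {
        "round_matches": lambda m: m.round == expected_round,
        "value_in_range": lambda m: lo <= m.value <= hi,
    }


checks = make_assertions(expected_round=3, lo=0.0, hi=100.0)
good = Message(sender=1, round=3, value=42.0)
bad = Message(sender=2, round=3, value=250.0)

good_result = violated(good, checks)  # no violations
bad_result = violated(bad, checks)    # the range constraint fails
```

A non-empty result is the point at which a real system would report the diagnostic information so that reconfiguration and recovery can start; that reporting step is left out here.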

    A General Method for Maximizing the Error-Detecting Ability of Distributed Algorithms

    The bound on component failures and their spatial distribution govern the fault tolerance of any candidate error-detecting algorithm. For distributed memory multiprocessors, the specific algorithm and the topology of the processor interconnection network define these bounds. This paper introduces the maximal fault index, derived from the system topology and local communication patterns, to demonstrate how a maximal number of simultaneous (Byzantine) component failures can be tolerated for a particular interconnection network and error-detecting algorithm. The index is used to design a fault-tolerant mapping of processes to processor groups such that the error-detecting ability of the algorithm is preserved for certain multiple simultaneous processor failures. This work was supported in part by the National Science Foundation under Grant Numbers MSS9216479 and CDA-9222827, and in part by the Air Force Office of Scientific Research under contract numbers F49620-92-J-0546 and F4962..

    Efficient Run-Time Assurance in Distributed Systems Through Selection of Executable Assertions

    Run-time assurance of a distributed system can be obtained by comparing, at run-time, the behavior of the program with the expected behavior described in the program's specification. Executable assertions, embedded into the program code, can determine when there are discrepancies, due to processor failures, between actual and expected behavior. Thus, there is no global monitoring scheme; instead, processes check each other. A non-faulty process will always perform correct computation. It can detect errors in other processes after receiving information from them and checking it against expected values by using executable assertions. To check programs efficiently at run-time, we need to determine how many assertions need to be used, where they need to be located, and what they need to check to ensure that all occurring errors can be detected. This paper introduces temporal subsumption to remove, from a given set of assertions for a specific program, the assertions which perform r..

    Efficient Run-time Assurance in Distributed Systems through Selection of Executable Assertions

    Run-time assurance of a distributed system can be obtained by comparing, at run-time, the actual behavior of a program with the expected behavior described in the program's specification. Executable assertions, embedded into the program code, can determine when there are discrepancies between actual and expected behavior. There is no global monitoring scheme, and error detection occurs at the process level. We can assume that a non-faulty process will always perform correct computations. It can detect errors in other processes after receiving information from them and checking it against expected values using executable assertions. To check programs efficiently at run-time, we need to determine how many assertions need to be used, where they need to be located, and what they need to check to ensure that all occurring errors can be detected. This paper introduces temporal subsumption to remove, from a given set of assertions for a specific distributed program, the assertions which perform redundant checking. The remaining set of assertions is then the set necessary to provide run-time assurance. To subsume assertions, the flow graphs of the individual components of the distributed system are examined using a graph traversal algorithm. Temporal subsumption is a pre-processing step that creates a smaller set of assertions to be embedded into the program and checked at run-time. This makes error detection at run-time less time-consuming, and thus more efficient, since redundant checking is avoided.
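The abstract does not give the traversal algorithm, so the following is only a sketch of how redundant assertions might be found on a flow graph. It substitutes a standard forward dataflow pass in the style of "available expressions" for the paper's method. It assumes, hypothetically, that each assertion is a set of conjunct strings, and that each statement comes with a set of conjuncts it invalidates. A conjunct counts as redundant at a node if it was already checked on every path reaching that node and has not been invalidated since.

```python
from typing import Dict, List, Set


def redundant_conjuncts(
    succ: Dict[str, List[str]],   # flow graph: node -> successor nodes
    checks: Dict[str, Set[str]],  # conjuncts asserted after each node's statement
    kills: Dict[str, Set[str]],   # conjuncts invalidated by each node's statement
    entry: str,
) -> Dict[str, Set[str]]:
    """Return, per node, the asserted conjuncts that are already established."""
    nodes = list(succ)
    preds: Dict[str, List[str]] = {n: [] for n in nodes}
    for n, outs in succ.items():
        for m in outs:
            preds[m].append(n)

    universe: Set[str] = set().union(*checks.values())
    # Optimistic start: assume everything is established, then iterate downward.
    avail_out = {n: set(universe) for n in nodes}
    avail_in: Dict[str, Set[str]] = {n: set() for n in nodes}

    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry or not preds[n]:
                inn: Set[str] = set()  # nothing has been checked before entry
            else:
                # A conjunct must hold on every incoming path.
                inn = set.intersection(*(avail_out[p] for p in preds[n]))
            # The statement runs first (may invalidate), then its assertion is checked.
            out = (inn - kills.get(n, set())) | checks.get(n, set())
            avail_in[n] = inn
            if out != avail_out[n]:
                avail_out[n] = out
                changed = True

    return {
        n: checks.get(n, set()) & (avail_in[n] - kills.get(n, set()))
        for n in nodes
    }


X, Y = "x >= 0", "y == x + 1"
succ = {"recv": ["a", "b"], "a": ["join"], "b": ["join"], "join": ["use"], "use": []}
checks = {"recv": {X}, "a": {X, Y}, "b": {Y}, "join": {X, Y}, "use": {X}}
kills = {"a": {Y}, "b": {Y}}  # both branches assign y

result = redundant_conjuncts(succ, checks, kills, entry="recv")
```

In this toy graph only the check of `x >= 0` at `recv` and the checks of `y == x + 1` in the two branches need to stay. The re-checks at `a`, `join`, and `use` are reported as redundant and could be dropped in a pre-processing step.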